Data available through: 2020-06-09
These plots show the cumulative total of cases and deaths by day, with the current cumulative totals marked. These totals are important; however, they are not helpful for judging whether the pandemic is slowing down or growing, because trends are difficult to see in cumulative curves like these.
Looking at new cases each day can help us see if the pandemic is slowing. A decreasing number of new cases per day is evidence that the pandemic is slowing down.
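As a minimal illustration of how daily new cases follow from the cumulative totals, here is a short Python sketch (not the report's own code; the function name `daily_new` is made up for this example). It differences the cumulative case counts for June 5 to June 9, 2020 from the table in this report.

```python
def daily_new(cumulative):
    """Return day-over-day differences of a cumulative series (oldest first)."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]

# Cumulative US cases for Jun 5 through Jun 9, 2020, as listed in the report's table.
total_cases = [1_891_236, 1_914_182, 1_932_187, 1_949_219, 1_967_925]

new_cases = daily_new(total_cases)
print(new_cases)  # [22946, 18005, 17032, 18706]
```

The four differences match the New Cases column for June 6 through June 9.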
Daily case totals can vary a lot for many reasons. One example is the availability of tests: reported cases will go down if tests are scarce and rise dramatically when more tests become available. One way to get a better sense of the overall trend is to smooth the data using a moving average.
The trends and raw data show a peak around mid-April and have been moving downward since. This may be due to stricter social distancing and lock-downs being enacted across the country. The daily new cases are also cyclical, with counts often lower on weekends and higher on weekdays.
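The smoothing step can be sketched as follows. This is an illustrative Python example, assuming a trailing 14-day window (each average covers the day itself and the 13 days before it); the function name `moving_average` is hypothetical. That assumption reproduces the June 9 value of the 14-day moving average column in the Growth Factor table.

```python
def moving_average(values, window=14):
    """Trailing moving average; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out

# New cases for May 27 through Jun 9, 2020 (oldest first), from the report's table.
new_cases_last_14 = [20057, 22490, 24983, 22987, 18853, 21331, 20523,
                     19498, 21813, 28783, 22946, 18005, 17032, 18706]

ma = moving_average(new_cases_last_14)
print(round(ma[-1]))  # 21286
```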
COVID-19 is much deadlier than the common flu. One way to measure the impact is to look at the death percentage, which is the total number of deaths divided by the total number of cases.
A big concern during April was that the death percentage was continually increasing, even when actual deaths per day were not increasing. Starting in early May the death percentage started to plateau around 6%.
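For concreteness, the death percentage for June 9, 2020 can be recomputed from the totals in the table below. This is a small Python sketch rather than the report's own code; returning `None` when there are no cases yet is an assumption made for this example.

```python
def death_percentage(total_deaths, total_cases):
    """Cumulative deaths as a percentage of cumulative cases."""
    if total_cases == 0:
        return None  # assumption: undefined before any case is recorded
    return 100 * total_deaths / total_cases

# Totals for Tue, Jun 09, 2020, from the report's table.
pct = death_percentage(111_359, 1_967_925)
print(round(pct, 3))  # 5.659
```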
Similar to new cases per day, deaths per day have been trending downward since about mid-April, although there are still spikes. These spikes may be due to reporting times, where counts on weekdays are often higher than those on weekends. Spikes also occur when previous deaths are re-assigned as COVID-19 related deaths, such as counting nursing home deaths and/or pneumonia deaths.
The actual values for the previous 14 days are detailed in the table below.
| Date | Total Cases | New Cases | Total Deaths | New Deaths | Death Percentage |
|---|---|---|---|---|---|
| Tue, Jun 09, 2020 | 1,967,925 | 18,706 | 111,359 | 982 | 5.659% |
| Mon, Jun 08, 2020 | 1,949,219 | 17,032 | 110,377 | 567 | 5.663% |
| Sun, Jun 07, 2020 | 1,932,187 | 18,005 | 109,810 | 394 | 5.683% |
| Sat, Jun 06, 2020 | 1,914,182 | 22,946 | 109,416 | 780 | 5.716% |
| Fri, Jun 05, 2020 | 1,891,236 | 28,783 | 108,636 | 1,099 | 5.744% |
| Thu, Jun 04, 2020 | 1,862,453 | 21,813 | 107,537 | 981 | 5.774% |
| Wed, Jun 03, 2020 | 1,840,640 | 19,498 | 106,556 | 999 | 5.789% |
| Tue, Jun 02, 2020 | 1,821,142 | 20,523 | 105,557 | 1,308 | 5.796% |
| Mon, Jun 01, 2020 | 1,800,619 | 21,331 | 104,249 | 651 | 5.790% |
| Sun, May 31, 2020 | 1,779,288 | 18,853 | 103,598 | 610 | 5.822% |
| Sat, May 30, 2020 | 1,760,435 | 22,987 | 102,988 | 849 | 5.850% |
| Fri, May 29, 2020 | 1,737,448 | 24,983 | 102,139 | 1,242 | 5.879% |
| Thu, May 28, 2020 | 1,712,465 | 22,490 | 100,897 | 1,192 | 5.892% |
| Wed, May 27, 2020 | 1,689,975 | 20,057 | 99,705 | 1,392 | 5.900% |
One important calculation is the growth factor, as outlined in 3Blue1Brown’s YouTube video on exponential growth. The growth factor is calculated as follows:
\[ \text{Growth Factor} = \frac{ \text{New-Cases}_N}{\text{New-Cases}_{N-1}} \] where \(N\) is a given day. Essentially, this takes the number of new cases today and divides it by the number of new cases yesterday.
The growth factor can be very helpful in determining if the pandemic is slowing. If the growth factor is less than 1, the number of new cases today is lower than yesterday. Once there are multiple days with a growth factor less than 1, it is a strong sign that the pandemic is slowing down.
What if there were 0 cases yesterday? This would make the growth factor undefined (or \(\infty\) according to R). This makes it difficult to look at trends. I have adjusted the growth factor so that if the previous day had 0 cases, the current day’s growth factor is equal to the number of new cases:
\[ \text{Growth Factor} = \begin{cases} \frac{ \text{New-Cases}_N}{\text{New-Cases}_{N-1}} & \text{if } \text{New-Cases}_{N-1} \neq 0 \\[1ex] \text{New-Cases}_N & \text{if } \text{New-Cases}_{N-1} = 0 \end{cases} \] I made this adjustment for the early or late stages of the pandemic, when the number of cases per day is 0, 1, or 2. However, given test scarcity and reporting times, there are situations in counties or states where there are 0 cases one day and then hundreds or thousands the next day. This large variability causes spikes in the growth factor in some plots.
Similar to the new cases per day, there can be a lot of variability in growth factors. To get a better sense of the trend, I am showing a 14-day moving average of the growth factor.
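The adjusted growth factor can be written as a small function. The following is an illustrative Python sketch of the piecewise definition, not the report's own code; the function name is hypothetical.

```python
def growth_factor(new_today, new_yesterday):
    """Adjusted growth factor: fall back to today's count when yesterday had 0 cases."""
    if new_yesterday == 0:
        return new_today
    return new_today / new_yesterday

# Jun 9 vs. Jun 8, 2020, using new-case counts from the report's table.
print(round(growth_factor(18_706, 17_032), 2))  # 1.1

# A day with 0 cases followed by a day with 250 cases (made-up numbers)
# yields a growth factor of 250, which is the kind of spike described above.
print(growth_factor(250, 0))  # 250
```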
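As a quick check of what this average looks like, the Python sketch below (illustrative only) averages the 14 daily growth factors listed in the table in this section. Because those inputs are already rounded to two decimals, the result only approximately reproduces the reported moving average.

```python
from statistics import mean

# Daily growth factors for May 27 through Jun 9, 2020 (oldest first), as rounded in the table.
growth_factors = [1.12, 1.12, 1.11, 0.92, 0.82, 1.13, 0.96,
                  0.95, 1.12, 1.32, 0.80, 0.78, 0.95, 1.10]

gf_ma = mean(growth_factors)
print(round(gf_ma, 2))  # 1.01
```

Days with factors above 1 are balanced by days with factors below 1, which is why the average sits so close to 1.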
The growth factor shows a different trend than new cases. Here, the growth factor has stayed around 1 since mid-April. Compare that to the new cases plot on the Overview tab, which shows a downward trend after a peak in mid-April. The growth factor remaining around 1 may be due to the cyclical nature of new cases being reported (high during the week, low during the weekends) - but it could also be showing that although the decrease in new cases is a positive sign, we are not out of the woods yet.
The actual values for the previous 14 days are detailed in the table below.
| Date | Total Cases | New Cases | New Cases 14-Day MA | Growth Factor | Growth Factor 14-day MA |
|---|---|---|---|---|---|
| Tue, Jun 09, 2020 | 1,967,925 | 18,706 | 21,286 | 1.1 | 1.01 |
| Mon, Jun 08, 2020 | 1,949,219 | 17,032 | 21,233 | 0.95 | 1 |
| Sun, Jun 07, 2020 | 1,932,187 | 18,005 | 21,379 | 0.78 | 1 |
| Sat, Jun 06, 2020 | 1,914,182 | 22,946 | 21,569 | 0.8 | 1.01 |
| Fri, Jun 05, 2020 | 1,891,236 | 28,783 | 21,521 | 1.32 | 1.02 |
| Thu, Jun 04, 2020 | 1,862,453 | 21,813 | 21,134 | 1.12 | 0.99 |
| Wed, Jun 03, 2020 | 1,840,640 | 19,498 | 21,430 | 0.95 | 1 |
| Tue, Jun 02, 2020 | 1,821,142 | 20,523 | 21,614 | 0.96 | 1.01 |
| Mon, Jun 01, 2020 | 1,800,619 | 21,331 | 21,620 | 1.13 | 1 |
| Sun, May 31, 2020 | 1,779,288 | 18,853 | 21,642 | 0.82 | 1.01 |
| Sat, May 30, 2020 | 1,760,435 | 22,987 | 21,618 | 0.92 | 1 |
| Fri, May 29, 2020 | 1,737,448 | 24,983 | 21,727 | 1.11 | 1.01 |
| Thu, May 28, 2020 | 1,712,465 | 22,490 | 21,721 | 1.12 | 0.99 |
| Wed, May 27, 2020 | 1,689,975 | 20,057 | 22,029 | 1.12 | 1.01 |
Since most of the COVID-19 measures are enacted by individual states, it may be more helpful for an individual to see the growth factor for the last 14 days in a specific state.
Build your own growth factor plot for a given state and time period by using a Shiny app. The app can be accessed through this link, mareichler.shinyapps.io/diy-covid19-plots/, and is also embedded below:
What if instead of looking at the average of the growth factor, we calculated the growth factor on the average of new cases? I’m going to call this \(\text{GF}_{14}\) to represent that it’s the growth factor based on the 14-day moving average of new cases.
Looking at the new cases 14-day moving average, it is clearly smoother than the raw new cases and most of the cyclical nature has been removed.
Using the new cases 14-day moving average, I am calculating \(\text{GF}_{14}\) as follows:
\[ \text{GF}_{14} = \frac{ \frac{1}{14}\sum_{i=N-13}^{N} \text{New-Cases}_i }{ \frac{1}{14}\sum_{i=N-14}^{N-1} \text{New-Cases}_i } \]
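A minimal Python sketch of this calculation follows, assuming each moving average covers exactly 14 days (a trailing window ending on the day in question). The function name `gf14` is hypothetical, and returning `None` when the previous window sums to 0 is an assumption, since the report does not say how that case is handled. The factor of 1/14 cancels, so the code compares the two window sums directly.

```python
def gf14(new_cases, window=14):
    """GF_14 for the latest day: latest window sum divided by the previous day's window sum."""
    if len(new_cases) < window + 1:
        raise ValueError("need at least window + 1 days of new cases")
    current = sum(new_cases[-window:])        # window ending today
    previous = sum(new_cases[-window - 1:-1])  # window ending yesterday
    if previous == 0:
        return None  # assumption: undefined when the previous window has no cases
    return current / previous

# A flat series gives a GF_14 of exactly 1 (made-up numbers).
print(gf14([100] * 15))  # 1.0

# Equivalent check using the 14-day moving averages for Jun 9 and Jun 8 from the table.
print(round(21_286 / 21_233, 4))  # 1.0025
```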
I am showing both the growth factor 14-day moving average and the \(\text{GF}_{14}\) for comparison.
It turns out in this case it does not make a significant difference! Even when looking at the growth factor based on the 14-day moving average of new cases (smoothing over the cyclical nature of new cases reporting), the result is a growth factor, \(\text{GF}_{14}\), that does not deviate significantly from 1.
| Period | Growth Factor | Growth Factor 14-day MA | GF_14 |
|---|---|---|---|
| All dates | 0.6890810 | 0.3117882 | 0.2281827 |
| Since 2020-03-15 | 0.2005171 | 0.1522074 | 0.1217525 |
| Last 14 days | 0.1554176 | 0.0078802 | 0.0081758 |
Looking at the standard deviations in the table above, it is unsurprising that the raw growth factor has the most deviation in every period. Over the two longer periods, \(\text{GF}_{14}\) has the least deviation, followed by the growth factor 14-day moving average; over the last 14 days the two smoothed measures are nearly identical.
Since there is not a large difference between the growth factor 14-day moving average and \(\text{GF}_{14}\), I’m going to use the growth factor 14-day moving average instead of \(\text{GF}_{14}\). When possible, I want to keep the calculations as simple and close to the raw data as possible.
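To show how such a standard deviation is obtained, the Python sketch below (illustrative, not the report's code) computes the sample standard deviation of the 14 daily growth factors listed in the Growth Factor table. Using the sample form (dividing by n - 1) is an assumption. Because the inputs are rounded to two decimals, the result is close to, but not exactly, the 0.1554176 reported for the raw growth factor over the last 14 days.

```python
from statistics import stdev

# Daily growth factors for May 27 through Jun 9, 2020 (oldest first), as rounded in the table.
growth_factors = [1.12, 1.12, 1.11, 0.92, 0.82, 1.13, 0.96,
                  0.95, 1.12, 1.32, 0.80, 0.78, 0.95, 1.10]

sd = stdev(growth_factors)  # sample standard deviation (n - 1 in the denominator)
print(sd)
```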
This plot is a quick overview of the last two weeks for each state. Using the three colors, you can see which states have seen an average decrease in daily cases for 14 days (i.e. growth factor consistently below 1) and may be ready to start the re-opening process, which states are not ready to reopen, and which states may need additional social-distancing and lock-down measures.
As mentioned in the Growth Factor tab, some states will have very large average growth factors. This is most likely due to having 0 cases on one day and then a large number of cases the next day, which is often caused by test availability or reporting.
This plot shows how the weekly average growth factor has changed by state since the start of the pandemic. Week 1 starts on January 22, 2020, which is the first day data is available. Although 7 days is an arbitrary cut-off point, separating the data into sections and averaging the growth factor helps see the spread of the virus.
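The weekly grouping can be sketched as below. This Python example is illustrative only and assumes that weeks are consecutive 7-day blocks counted from January 22, 2020 (rather than calendar weeks); the function names and the example growth factors are made up.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

FIRST_DAY = date(2020, 1, 22)  # first day with data, per the report

def week_number(day, first_day=FIRST_DAY):
    """1-based index of the 7-day block that contains `day`."""
    return (day - first_day).days // 7 + 1

def weekly_average(daily):
    """daily: dict mapping date -> growth factor. Returns dict mapping week -> mean."""
    buckets = defaultdict(list)
    for day, gf in daily.items():
        buckets[week_number(day)].append(gf)
    return {week: mean(values) for week, values in sorted(buckets.items())}

# Made-up growth factors: 1.0 for the first seven days, 2.0 for the next three.
example = {FIRST_DAY + timedelta(days=i): (1.0 if i < 7 else 2.0) for i in range(10)}

print(weekly_average(example))         # {1: 1.0, 2: 2.0}
print(week_number(date(2020, 6, 9)))   # 20
```

Under this assumption, the last day of data in this report (June 9, 2020) falls in week 20.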
Why are some of the growth factors negative? Having a negative growth factor is theoretically impossible; however some states end up having negative growth factors because they update their total case numbers to a lower number. This results in having negative daily cases and then a negative growth factor.
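A tiny worked example makes this concrete. The Python sketch below uses made-up cumulative totals in which the third day is revised downward; the helper names are hypothetical and mirror the growth factor definition from the Growth Factor tab.

```python
def daily_new(cumulative):
    """Day-over-day differences of a cumulative series (oldest first)."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]

def growth_factor(new_today, new_yesterday):
    """Adjusted growth factor: fall back to today's count when yesterday had 0 cases."""
    if new_yesterday == 0:
        return new_today
    return new_today / new_yesterday

# Made-up cumulative totals; the drop from 120 to 115 represents a downward revision.
cumulative = [100, 120, 115, 125]

new_cases = daily_new(cumulative)
factors = [growth_factor(today, yesterday)
           for yesterday, today in zip(new_cases, new_cases[1:])]

print(new_cases)  # [20, -5, 10]
print(factors)    # [-0.25, -2.0]
```

A single revision produces two negative growth factors: one on the day of the revision and one on the following day, when the negative count becomes the denominator.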
Although most lock-down measures are enacted at the state level, it is still interesting to look at the growth factor on the county level.
A limitation of this plot is that there are 12,613 cases that have not been assigned to a county. Although that only accounts for 0.641% of the total cases in the United States, it could have a big impact on a given county.
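The share of unassigned cases can be checked against the US total for June 9, 2020 from the Overview table, as in this short Python sketch (illustrative only; the variable names are made up).

```python
unassigned_cases = 12_613       # cases not assigned to a county, per the report
total_us_cases = 1_967_925      # cumulative US cases on Jun 9, 2020, per the report

unassigned_pct = 100 * unassigned_cases / total_us_cases
print(round(unassigned_pct, 3))  # 0.641
```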
This data is downloaded from USA Facts. I use two of the three datasets available: total cases and total deaths. Both of these datasets are broken down by state and county. This data requires additional formatting, calculation, and aggregation. USA Facts gets data by county on a daily basis; these county values are totaled to get values for each day for individual states and the entire US.
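The aggregation step can be sketched as follows. This Python example is illustrative only: the record layout `(state, county, date, total_cases)` is an assumption and not the actual USA Facts file format, and all counts are made up.

```python
from collections import defaultdict

# Hypothetical county-level records: (state, county, date, total_cases). All numbers are made up.
records = [
    ("MN", "Hennepin", "2020-06-08", 100),
    ("MN", "Ramsey",   "2020-06-08", 40),
    ("WI", "Dane",     "2020-06-08", 30),
    ("MN", "Hennepin", "2020-06-09", 110),
    ("MN", "Ramsey",   "2020-06-09", 45),
    ("WI", "Dane",     "2020-06-09", 33),
]

state_totals = defaultdict(int)  # keyed by (state, date)
us_totals = defaultdict(int)     # keyed by date

for state, county, day, cases in records:
    state_totals[(state, day)] += cases  # sum counties within a state
    us_totals[day] += cases              # sum all counties for the national total

print(state_totals[("MN", "2020-06-09")])  # 155
print(us_totals["2020-06-09"])             # 188
```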
The American CDC links to USA Facts under Cases & Death by County, which is how I found the data source.
Many of the plots have been restricted to show data from March 15, 2020 onward. This is when case numbers started to rise and preventative measures started to increase dramatically.
A major limitation of this data is that reported new cases (and thus the growth factor) may not consistently and accurately represent the true number of new cases each day. As mentioned before, this could be due to test availability, reporting protocols, and a number of other variables. This information is a helpful tool in trying to understand the pandemic, but it may not reflect the entire story.